A refined information processing capacity metric allows an in-depth analysis of memory and nonlinearity trade-offs in neurocomputational systems

Since dynamical systems are an integral part of many scientific domains and can be inherently computational, analyses that reveal in detail the functions they compute can provide the basis for far-reaching advances in various disciplines. One metric that enables such an analysis is the information processing capacity. It not only quantifies the complexity of a system's computations in an interpretable form, but also reveals the system's different processing modes, each with its own requirements on memory and nonlinearity. In this paper, we provide a guideline for adapting this metric to continuous-time systems in general and to spiking neural networks in particular. We investigate ways to operate the networks deterministically in order to prevent the negative effects of randomness on their capacity. Finally, we present a method that removes the restriction to linearly encoded input signals. This allows components within complex systems, such as areas within large brain models, to be analyzed separately, without the need to adapt their naturally occurring inputs.

parameter    value               description
V_min        0 mV                minimum of the membrane potential distribution
V_max        20 mV               maximum of the membrane potential distribution
τ_m          20 ms               membrane time constant
C_m          1 pF                membrane capacitance
E_L          0 mV                resting membrane potential
d            1.5 ms              synaptic delay
V_th         20 mV               threshold potential
V_reset      10 mV               reset potential
τ_ref        2 ms                refractory period
N            1250                network size
N_exc        1000                number of excitatory neurons
N_inh        250                 number of inhibitory neurons
C_exc        100                 number of incoming excitatory synapses
C_inh        25                  number of incoming inhibitory synapses
g            5                   ratio of inhibitory to excitatory weight
w_exc        0.2 pA              excitatory synaptic weight
w_inh        −g·w_exc = −1 pA    inhibitory synaptic weight
ν_noise      4000 spk/s          rate of background noise
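As an illustration, a network with these parameters could be set up as in the following sketch; the use of NEST and the iaf_psc_delta neuron model is an assumption here, and the actual simulation setup may differ:

```python
import nest

# Neuron parameters from the table above
neuron_params = {
    "tau_m": 20.0,    # membrane time constant (ms)
    "C_m": 1.0,       # membrane capacitance (pF)
    "E_L": 0.0,       # resting potential (mV)
    "V_th": 20.0,     # threshold potential (mV)
    "V_reset": 10.0,  # reset potential (mV)
    "t_ref": 2.0,     # refractory period (ms)
}
N_exc, N_inh = 1000, 250
C_exc, C_inh = 100, 25
w_exc, g = 0.2, 5.0   # excitatory weight (pA); inhibitory weight is -g * w_exc
delay = 1.5           # synaptic delay (ms)

exc = nest.Create("iaf_psc_delta", N_exc, params=neuron_params)
inh = nest.Create("iaf_psc_delta", N_inh, params=neuron_params)
neurons = exc + inh

# Initial membrane potentials would be drawn from [V_min, V_max] here; a
# deterministic variant could fix them to remove one source of randomness.

# Fixed in-degree connectivity: every neuron receives C_exc excitatory
# and C_inh inhibitory synapses.
nest.Connect(exc, neurons, {"rule": "fixed_indegree", "indegree": C_exc},
             {"weight": w_exc, "delay": delay})
nest.Connect(inh, neurons, {"rule": "fixed_indegree", "indegree": C_inh},
             {"weight": -g * w_exc, "delay": delay})

# Poisson background noise at nu_noise = 4000 spk/s
noise = nest.Create("poisson_generator", params={"rate": 4000.0})
nest.Connect(noise, neurons, syn_spec={"weight": w_exc, "delay": delay})
```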

Figures
Figure 1: capacity heat maps for a fixed γ value of 1 (lower half).
We have thus removed all encoder capacities and their delayed versions, resulting in a lower capacity bound. The fact that there are no differences between the results of the different transformation functions tells us that the networks do not compute target functions $y_l$ with an encoder capacity $C_{\text{enc}}(y_l) > 0$ better than the encoder does. First, we take a closer look at the reconstructions of the target signal and the squared correlation coefficient, which forms the basis for the capacity evaluation.
The reconstructed function $z_l$ is a weighted sum of $y_l$ and an uncorrelated noise signal $b_l$:

$$z_l = \alpha_l y_l + b_l \tag{1}$$

where $\alpha_l$ is the relative weight of $y_l$ compared to $b_l$ and is related to the capacity:

$$C_l = \frac{\alpha_l^2\, \sigma^2(y_l)}{\alpha_l^2\, \sigma^2(y_l) + \sigma^2(b_l)} \tag{2}$$

Therefore, the relationship between capacity $C_l$ and $\alpha_l$ is nonlinear and depends on the ratio between the variances of $b_l$ and $y_l$, as Equation 2 and Figure 2A show. Note that this ratio can differ between target functions.
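This relationship can be checked numerically with synthetic signals (a self-contained sketch, not the paper's evaluation code): for $z_l = \alpha_l y_l + b_l$ with uncorrelated noise, the squared correlation between $z_l$ and $y_l$ should match Equation 2.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 100_000
alpha, beta = 0.8, 0.5  # relative weight and variance ratio (illustrative)

y = rng.standard_normal(T)                  # target function, variance 1
b = np.sqrt(beta) * rng.standard_normal(T)  # uncorrelated noise, variance beta
z = alpha * y + b                           # reconstructed function

# Measured capacity: squared correlation coefficient between z and y
C_measured = np.corrcoef(z, y)[0, 1] ** 2
# Predicted capacity from Equation 2 (variances divided out)
C_predicted = alpha**2 / (alpha**2 + beta)

print(f"measured C = {C_measured:.4f}, predicted C = {C_predicted:.4f}")
```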

Figure 2: capacity transformation
Following Equation 2, we can define the function $f_T$ that calculates $\alpha_l$ based on the capacity $C_l$ and the ratio of variances $\beta_l$ of $b_l$ and $y_l$:

$$\alpha_l = f_T(C_l, \beta_l) = \sqrt{\frac{\beta_l\, C_l}{1 - C_l}} \tag{3}$$

To remove the encoder effects, we first calculate the capacity of the encoder output. We use the resulting capacity profile together with the capacities of the overall system to calculate the effective linear memory that the main system introduces in addition to the encoder memory. To do this, we subtract the encoder memory $M^{\text{enc}}$ from the combined memory $M^{\text{comb}}$ for each delay $i$ ($M_i$ is the capacity with the input $u$ delayed by $i$ steps as target function):

$$M_i^{\text{main}} = f_T\!\left(M_i^{\text{comb}}\right) - f_T\!\left(M_i^{\text{enc}}\right) \tag{4}$$
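A direct implementation of Equations 3 and 4 could look as follows (a sketch; the memory profiles and the assumed variance ratio β are illustrative placeholders):

```python
import numpy as np

def f_T(C, beta):
    """Equation 3: map a capacity C to the relative weight alpha,
    given the variance ratio beta = var(b) / var(y)."""
    C = np.clip(C, 0.0, 1.0 - 1e-12)  # guard against division by zero at C = 1
    return np.sqrt(beta * C / (1.0 - C))

# Illustrative linear memory profiles over delays i = 0..4
M_comb = np.array([0.95, 0.80, 0.55, 0.30, 0.10])  # combined system (encoder + main)
M_enc  = np.array([0.90, 0.40, 0.10, 0.02, 0.00])  # encoder alone
beta_u = 1.0  # assumed variance ratio for the delayed inputs u(k - i)

# Equation 4: effective linear memory added by the main system
M_main = f_T(M_comb, beta_u) - f_T(M_enc, beta_u)
```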
The transformation $f_T$ allows us to obtain meaningful results when we add, subtract, or divide the $M$ values, under the assumption that we know $\beta_{u_i}$. Based on the system and encoder memory values, we calculate the memory ratio $\gamma$ for all delays, i.e., the fraction of the encoder input that the system can memorize after a delay $i$:

$$\gamma_i = \frac{M_i^{\text{main}}}{f_T\!\left(M_0^{\text{enc}}\right)} \tag{5}$$

Using these memory ratios, we compute the remembered encoder capacities $C_{\text{rem}}$, i.e., how much of a capacity value for a target function can be based on remembering a previous target function that is already computed by the encoder:

$$C_{\text{rem}}(y_n) = \gamma_i\, C_{\text{enc}}(y_m) \tag{6}$$

where $C_{\text{rem}}(y_n)$ is the remembered capacity for the target function $y_n$, which corresponds to the target function $y_m$ delayed by $i$ steps. These remembered capacities are the result of the linear memory of the system and the nonlinearity and memory of the encoder; therefore, we must subtract them from the capacities of the combined system to obtain the effective capacity of the main system for all target functions $y_l$:

$$C_{\text{sys}}(y_l) = C_{\text{comb}}(y_l) - C_{\text{rem}}(y_l) \tag{7}$$

where $C_{\text{comb}}$ is the measured capacity of the combined system, including the encoder and the main system. With this method we cannot obtain information about the precise functions the system computes, as we could with a linearly encoded input, because the system does not compute the function $F_{\text{sys}}(u)$ but the function $F_{\text{sys}}(F_{\text{enc}}(u))$. Therefore, a system capacity $C_{\text{sys}}(y_l) > 0$ does not mean that the system computes the specific target function $y_l$ with the degree $d_{y_l}$.
However, it tells us that the system computes a function which goes beyond remembering the input signal.
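As a minimal sketch of how Equations 6 and 7 could be combined in code (illustrative, not the authors' implementation; the helper mapping `delayed_index` and all input arrays are hypothetical):

```python
import numpy as np

def effective_capacity(C_comb, C_enc, gamma, delayed_index):
    """Subtract remembered encoder capacities (Equations 6 and 7).

    C_comb        -- measured capacities of the combined system, per target function
    C_enc         -- capacities of the encoder, per target function
    gamma         -- memory ratio gamma_i, per delay i
    delayed_index -- hypothetical lookup: delayed_index[n][i] = m if the target
                     function y_n equals y_m delayed by i steps, else None
    """
    C_sys = np.asarray(C_comb, dtype=float).copy()
    for n, sources in enumerate(delayed_index):
        # Remembered capacity: encoder capacity carried over by linear memory
        C_rem = sum(gamma[i] * C_enc[m]
                    for i, m in enumerate(sources) if m is not None)
        C_sys[n] -= C_rem
    return np.clip(C_sys, 0.0, None)  # negative residuals truncated in this sketch
```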
The problem with the transformation $f_T$ is that we do not know the variance ratio $\beta$ and therefore cannot remove the exact encoder effects. However, we can test different ways to approximate $f_T$. The simplest approximation is to use the identity function, i.e., not to transform the capacities before calculating $\gamma$ and subtracting the encoder values. Other possibilities are to take the square root of the capacities, which yields the correlation coefficient instead of its square, or to set $\beta$ to a fixed value (e.g., 1) for each target function. To obtain a lower bound for the capacities, we can set $\gamma_i$ to 1 for all delays $i$ with a linear encoder capacity $C[u(k-i)] > 0$. This leads to a complete subtraction of all encoder capacities and their delayed versions and thus to a lower bound for the capacity.
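In code, this lower-bound variant simply replaces the estimated memory ratios with 1 wherever the encoder retains any linear memory (continuing the illustrative arrays from the sketches above):

```python
import numpy as np

M_enc = np.array([0.90, 0.40, 0.10, 0.02, 0.00])  # illustrative encoder memory profile

# Set gamma_i = 1 for every delay i with linear encoder capacity C[u(k - i)] > 0,
# so all encoder capacities and their delayed versions are subtracted in full.
gamma_lower = (M_enc > 0).astype(float)

# Passing gamma_lower to effective_capacity() from the earlier sketch yields a
# conservative lower bound on the main system's capacities.
```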